Greg Detre
Tuesday, April 08, 2003
Is Behavioural Processes a reputable journal???
Are there many other behaviourist publications??? What's its mandate???
This was a
curious paper, not least of all because I still don't know what it's about.
Complementarity
theory - "complementarity occurs whenever some quantity is conserved".
This might be knowledge (as in the case of position and momentum at the
quantum??? level), field of view vs magnification???, mental resources, physical resources, precision vs clarity/amount
of data needed/expedience...
In the case
of our processing abilities, we need to be aware of complementarity to avoid
summarily dismissing theories because they are too complex, or too superficial.
"Scientists
have yet to develop a set of techniques for changing the field of view of a
theory while guaranteeing connectedness through the process: theoretical depth
of focus is discrete, not continuous." Is there any hope that such
domain-general techniques could ever exist??? Is this not simply another way of
restating the epistemological problem of (weak) emergence - that is, we cannot
keep track of many interacting components (even if the rules of their
interactions are simple, and especially if they are weak). Yes, this is what he
says when he writes "non-linear interactions, however, give rise to
'emergent phenomena'". I like the idea that all models should ideally
demonstrate that they preserve phenomena one level up and one level down. Can
we clarify this notion of level???
I need to
look up conditioning, reinforcement, operant vs classical, multiple schedules
and Garcia??? see pg 34.
Complementarity
of explanation??? pg 35
I like the idea
of many types of explanation. Is there any problem/misreading in his paraphrase
of 'cause' with 'explanation'???
Write to a
Greek scholar re this.
Aristotle's
four -be-causes:
- efficient
- occur before an event and/or(???) trigger it - sufficient causes - , or their
absence prevents it - necessary causes -. Usual sense of 'cause'.
Skinner's
variables of which behaviour is a function???
- material
- substrates, underlying mechanisms. "Assertions that they are the best or
only kind of explanation is reductionism".
- final -
the reason an entity/process exists, what it does that has justified its
existence.
Skinner's
selection by consequences???
"Assertion
that final causes are time-reversed efficient causes is teleology: results
cannot bring about their efficient causes."
When you
ask what a strange machine does, you are seeking a final cause. Try and fit
with Marr's three levels??? Final causes can be proximal or distant, e.g.
evolutionary pressures vs a history of reinforcement or intentions.
- formal causes
- analogues (sp???), metaphors and models, the structures with which we
represent phenomena and which permit us to predict and control them. e.g. the
syllogism, or the differential equation, molecular model.
Skinnerian
three-term contingency???
"all understanding
involves finding an appropriate formal cause - that is, mapping phenomena to
explanations having a similar structure to the thing explained." Hmmm. How
is this different from the material cause??? This may be what we mean by
internalisation, or the process by which we learn, but 'understanding' implies
both these - very similar - senses, but also the process by which we maintain a
conceptual representation. After all, when we understand a concept, we may want
to then use it later as an analogue for some even newer concept - we don't
necessarily do this each time by recursive reference to the previous underlying
concept or analogue. If nothing else, this would lead to an infinite regress...
Interesting:
causal, reductive, functional and formal explanations...
Could the
difference between material and formal causes lie in the formality of their
respective representations??? Or is it that the material is to do with actual
physical types of substrates/materials, whereas the formal is to do with the actual
algorithms performed, while the final is to do with why it performs those algorithms
or what role the inputs and outputs play in some larger system, i.e. in terms
of why and purpose rather than how???
"A
formal explanation proceeds by apprehending the event to be explained and
placing it in correspondence with a model". Model identifies necessary or
sufficient antecedents for the event. If those are found in the empirical
realm, the phenomenon is said to be explained. Confounds must also be
eliminated.
Find out
more about Behavioural Processes??? Are there really still behaviourists
around??? Is there any way to restate their position to make it plausible, less
strong??? Could it be that it's still worth talking occasionally in these
behaviourist terms if we're just thinking about one agent/component among many
which happens to use some behaviourist learning paradigms???
Analog vs
analogy - etymology???
Distinction
between post- and pre-diction.
Control:
the user of a model introduces a variable known to bring about a certain
effect.
Explanation
is "thus the provision of a model with -a- the events to be explained as a
consequent, and -b- events noticeable in the environment as antecedents".
Isn't this just material/efficient explanation though???
Control is
"the arrangement of antecedents in the context of a model that increases
the probability of desired consequences"
Truth is
"a state of correspondence between models and data". But you need to
specify both the model and the data it attempts to map. Hmmm??? "Finding
ways to make models applicable to apparently diverse phenomena is part of the
creative action of science." You not only have to map the variables to
their empirical instantiations, but also the operators.
"Life
is sacred, except in war; war is bad, except when fought for justice; justice
is good, except when untempered by humanity".
Truth is
binary, but precision is graded
jejune???
False
models may apparently be more useful than true ones, e.g. Newtonian mechanics.
Hmm. "It is trivial to show a model false; restricting the domain of the
model or modifying the form of the model to make it truer is the real
accomplishment."
Modelling
tools are formal structures, whereas models are such tools applied to a data
domain.
"Behavioural
science is a search in the empirical domain for the variables of which
behaviour is a function; and a search in the theoretical domain for the
functions according to which behaviour varies (= models)"???
Don't like
his definition of categorisation - although it's not obvious that it is a
definition of all categorisation. See pg 38.
Discussion
of reinforcement and the law of effect???
Skinner:
the reflex must be viewed in set-theoretic terms. Reinforcement acts to
strengthen movements of a similar kind.
"Operant
responses are those movements whose occurrence is correlated with prior
(discriminative) stimuli and subsequent (reinforcing) stimuli."
reinforce a
reminder using language - interesting... Pg 39
sets vs
family resemblances - can behaviourism deal with these???
Operant???
Blocking???
Figure 2???
Don't you need to represent these learning paradigms with before and after
shots???
All
probabilities are conditional on a universe of discourse - our goal is to define
the relevant universe - context - the same way as the animal does, so that our
model predicts its behaviour.
Ford effect
in politics - pg 42
Are the
formulations he gives of Bayes' theorem the same as the normal ones??? Don't
they usually involve intersection???
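For reference, the textbook statement (not copied from the paper, so whether his pg 42-43 versions match still needs checking). The intersection appears in the definition of conditional probability, and the usual form of Bayes' theorem follows from it. H and D are just my labels for hypothesis and data, and the sum assumes the hypotheses H_i are mutually exclusive and exhaustive.

```latex
P(A \mid B) = \frac{P(A \cap B)}{P(B)}
\quad\Longrightarrow\quad
P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)},
\qquad
P(D) = \sum_i P(D \mid H_i)\,P(H_i)
```

So a version written without an intersection can still be equivalent, since P(H ∩ D) = P(D | H) P(H).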
Considers
the value of a Bayesian inference, given the probability of the
antecedents...??? Pg 43
"What
is the probability that the model in question accounts for more of the variance
in the empirical data than does some other model of equal complexity?"
Am I being
stupid about the axes on pg 44???
Shepard's
non-metric multidimensional scaling???
Change is
fundamental, and calculus is the language of change
"Behaviour
is change in stance over time". Skinner: behaviour is "the movement of an
organism within a frame of reference"
how
interesting is this to me as a non-behaviourist??? what's neo-behaviourism???
Is he a neo-behaviourist??? Is that a purely philosophical position on the
mind-body problem???
"Finding
the right balance in life is finding the point at which life-satisfaction has
zero-derivatives", i.e. the factors which affect life-satisfaction cannot
be increased without affecting another negatively, right??? Pg 45
elegant
generalisations in calculus: not to find a point on a function that minimises
some value, but rather to find a -function- that minimises some value, e.g. the
brachistochrone problem - what shape of surface will cause a ball
rolling down it to arrive at the bottom in minimum time? Posed by
Bernoulli, five solutions, including Newton's, his own, his brother's, one of
their students' and Leibniz's
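A quick numerical check of the brachistochrone idea, as a sketch rather than anything from the paper: compare the descent time along a straight ramp with the time along a cycloid (the known solution) between the same two points. Assumptions are mine: frictionless sliding from rest, y measured downwards, g = 9.81, and the end point chosen as the bottom of a cycloid of radius 1.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def straight_line_time(x, y):
    """Time to slide from rest at (0, 0) down a frictionless straight ramp
    to (x, y), with y measured downwards."""
    length = math.hypot(x, y)
    acceleration = G * y / length  # component of gravity along the ramp
    return math.sqrt(2 * length / acceleration)


def cycloid_time(radius, theta_end=math.pi):
    """Time to slide from rest along the cycloid
    x = r(t - sin t), y = r(1 - cos t), for t from 0 to theta_end.
    Along this curve dt = sqrt(r / g) * dtheta, so the integral is trivial."""
    return theta_end * math.sqrt(radius / G)


radius = 1.0
end_x, end_y = math.pi * radius, 2 * radius  # lowest point of the cycloid

t_line = straight_line_time(end_x, end_y)
t_cycloid = cycloid_time(radius)
print(f"straight ramp: {t_line:.3f} s, cycloid: {t_cycloid:.3f} s")
```

The cycloid comes out faster (roughly 1.00 s against 1.19 s for these numbers), even though it is the longer path. What is being optimised is the whole curve, not a point on it.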
Decision
theory "is a way of combining measures of stimuli and reinforcers to
predict which response will be most strongly reinforced"
is this
what learning in real brains is actually about though??? pg 47
fig 6:
"the multi-dimensional signal-detection/optimisation problem faced by real
organisms: how to access a particular reinforcer"
closed loop
environments - when actions change the environment, which in turn changes
future actions
open vs
closed loop??? open is where your actions don't affect the world feeding back
into your sensorium, right??? I think so.
If
signalling is possible, players will seek signs of character - that is,
predictors of future behaviour - signal detection becomes a survival skill.
"Organisms
such as rats and humans may be viewed as finite-state automata, differing
primarily in the amount of memory that is available to them. This statement
does not mean that they are nothing but automata." When he says that this
doesn't mean they're nothing but automata, what else??? Perhaps he's saying
that they could also be modelled as more powerful types of automata... No, I
don't think so.
"More
memory means more capacity to retain and relate conditional probabilities.
Enhanced ability to conditionalise permits nuanced reactions." No no no.
"The
mind of science may be claimed by philosophy, but its heart belongs to
tinkerers and problem-solvers."
"The
difference between scientists and anagram fans is the idea that scientific
problems are part of a larger puzzle set; that one puzzle solved may make other
pieces fall into place. But then jigsaw puzzles have that feature too."
Hmmm. But they don't make other jigsaw puzzles fall into place. Hmmm, well,
science is compartmentalised to some degree too. That's what abstraction is all
about. I don't really understand his point about the jigsaw puzzles, other than
maybe as a self-deprecation about the appositeness of his own analogy.
"The
goal of science is not perfect models, because the only perfect renditions are
the phenomena sui generis; the goal is better models". Hmm, isn't this
begging the question, i.e. that there is no such thing as a perfect model???
"The
more laconic a model, the more likely we can extrapolate its prediction to new
situations without substantial tinkering. The most succinct models are called
elegant."
Argh. He's
got the apostrophe in 'it's' wrong. Pg 50
"dogs are
only able to learn causality if the events, actions and consequences are proximate
in space and time, and as long as the consequences are motivationally
significant"
animals and
their trainers act as a coupled system to guide the animal's exploration of its
state, action and state-action spaces
take advantage of predictable regularities
maximal use of any supervisory signals (implicit or explicit)
make them easy to train by humans
the
synthetic dog mimics some of a real dog's ability to learn including:
the best action to perform in a given context
what form of a given action is most reliable in producing reward
the relative reliability of its actions in producing a reward and
altering its choice of action accordingly
to recognise new and valuable contexts such as acoustic patterns
to synthesise new actions by being "lured" into novel configurations or
trajectories by the trainer
the
behavioural architecture is one in which learning can occur, rather than an
architecture that solely performs learning???
"integrated
approach to state, action and state-action space discovery within the context
of reinforcement learning and an articulation of heuristics and design
principles that make learning practical for synthetic characters"
most
approaches to generating motor primitives focus on learning "how to move"
subject to some criteria such as energy minimisation
they focus on learning the "value with respect to a motivational goal of
moving in a certain way"
state = "a
specific, hopefully useful configuration of the world as sensed by the
creature's entire sensory system. As such, state can be thought of as a label
that is assigned to a sensed configuration. The space of all represented
configurations of the world is the state space"
action =
"how a creature can affect the state of its world" - finite set of actions, one
at a time, action space is the set of all possible actions
state/action
pair = <S/A>, relationship between a state S and an action A. "typically
accompanied by some numerical value, e.g. future expected reward, that
indicates how much benefit there is in taking the action A when the creature
senses state S. Based on this relationship a policy is built, which
represents a probability with which the creature selects an action given a
specific state"
credit
assignment = "the process of updating the associated value of a state-action
pair to reflect its apparent utility for ultimately receiving reward"
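To pin down these definitions for myself, here is a minimal sketch using standard tabular Q-learning. This is the textbook algorithm, swapped in for illustration, and not the paper's own credit-assignment scheme. The state and action names, the learning rate alpha, the discount gamma and the epsilon-greedy policy are all my assumptions.

```python
import random
from collections import defaultdict


def make_q_table():
    """Value table: maps a (state, action) pair to expected future reward."""
    return defaultdict(float)


def credit_assignment(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """One Q-learning update: move the value of (state, action) towards
    the reward received plus the discounted best value available next."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


def policy(q, state, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy policy: usually pick the highest-valued action for
    this state, occasionally explore a random one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical example: the dog senses an outstretched hand and sits.
actions = ["sit", "bark"]
q = make_q_table()
credit_assignment(q, "hand_out", "sit", 1.0, "hand_out", actions)
print(policy(q, "hand_out", actions, epsilon=0.0))  # greedy choice
```

The table holds the <S/A> values, `credit_assignment` is the update, and `policy` turns values into a (stochastic) choice of action.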
animals are
biased to learn proximate causality
Leyhausen
suggests that the individual actions may be largely self-reinforcing, rather
than being reinforced via back-propagation
does
Q-learning work in a dynamic environment???
clicker
training with real animals???
clicker
training:
1.
create
an association between the sound of a toy clicker and a food reward - then use
the click sound to "mark" behaviours that they wish to encourage
animals assume that an action/stimulus immediately preceding a
motivationally significant consequence is "as good as causal"
easy to provide immediate feedback - bridges the gap between the dog earning and
receiving the reward
2.
in
order to get the dog to first produce the desired behaviour so that it can be
rewarded, the trainer encourages the dog to perform specific behaviours, e.g.
training the dog to touch an object such as the trainer's hand or a "target
stick", luring the dog through a trajectory or into a pose
the animal can learn to associate reward with its resulting body
configuration/trajectory, and not just the action of following its nose
shaping = when the trainer guides the dog towards the desired behaviour
by rewarding ever-closer approximations
3.
add a
discriminative stimulus (e.g. gesture or vocal cue), usually trained by being
issued just after the animal has started to perform the action
teaching the action first and then the cue is unlike other training
techniques
temporal
window around an action's onset - through variations in how the action is
performed and by attending to correlations between the action's reliability in
producing reward and the state of contemporaneous stimuli, they are performing
local search in a potentially valuable neighbourhood
the state
and action spaces often contain a
natural hierarchical organisation that facilitates the search process
need to be
able to train with just observable behaviour, without looking at internal state
animals
seem to build models of important sensory cues "on demand", using rewarded
actions as the context for identifying important sensory cues and for guiding
the perceptual model of the cue
discover, based on experience, those patterns or motions that seem to
matter and add them dynamically to their respective spaces - state space
discovery and action space discovery
use the context of a rewarded action to facilitate the classification
process
during
luring, animals delegate credit from the "follow your nose" state-action pair
to another pair
hierarchical
representation of state space - then we can "notice" that a given action is
more reliable when a whole "class" of states is active - further
exploration/refinement within that class of states might be fruitful
state space
is represented by a percept tree
percepts are atomic perception units, with arbitrarily complex logic,
whose job it is to recognise and extract features from the raw sensory data
if a percept is activated, the sensory data is passed recursively to the
percept's children for more specific classification
how are new percepts generated???
percepts similar to codelets???
hierarchically organised though
fuck - have they already done what I wanted to do???
are there action equivalents of codelets???
cepstral coefficients???
motion percepts use a model that represents a
path through the space of possible motions - eh???
in RL terminology, a percept refers to a subset
of the entire current space
percept decomposition of state allows for a heuristic search through
potentially intractable state and state-action spaces
but it makes learning conjunctions of features harder - why???
because they fit into more than one place in the hierarchy???
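A rough sketch of how I picture the percept tree, to check my understanding of the recursive classification. The class name, the `matches` predicate and the example percepts are all hypothetical and are not taken from the paper's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Percept:
    """Atomic perception unit: a named test on raw sensory data, plus
    more specific child percepts."""
    name: str
    matches: Callable[[dict], bool]
    children: List["Percept"] = field(default_factory=list)

    def classify(self, sensory_data: dict) -> List[str]:
        """Return the names of all active percepts in this subtree.
        Data is only passed down to the children if this percept fires."""
        if not self.matches(sensory_data):
            return []
        active = [self.name]
        for child in self.children:
            active.extend(child.classify(sensory_data))
        return active


# Hypothetical tree: a root that always fires, with sound and shape branches.
root = Percept("anything", lambda d: True, [
    Percept("sound", lambda d: "sound" in d, [
        Percept("sit_utterance", lambda d: d.get("sound") == "sit"),
    ]),
    Percept("shape", lambda d: "shape" in d),
])

print(root.classify({"sound": "sit"}))
```

In this picture a conjunction such as "this sound AND this shape" has no single node of its own, because the two features sit in different branches. That may be part of why conjunctions are harder to learn.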
actions =
identifiable patterns of motion through time
treating actions as verbs allows you to treat the action as a label
but isn't good for the type of action space discovery needed for luring
instead, use a pose space that contains all of its possible body
configurations, and an action is a path through this space
nodes in the pose graph are connected together in tangled, directed,
weighted graphs
can calculate a distance metric between two paths that measures their
similarity
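The notes do not say which distance metric the paper uses, so this is only one plausible sketch: treat each path as a sequence of pose vectors, resample both to the same number of points, and average the pointwise Euclidean distance. The resampling scheme and the function names are my assumptions.

```python
import math


def resample(path, n):
    """Linearly interpolate a path (list of equal-length pose tuples)
    to exactly n points, n >= 2."""
    last = len(path) - 1
    points = []
    for i in range(n):
        t = i * last / (n - 1)
        lo = int(t)
        hi = min(lo + 1, last)
        frac = t - lo
        points.append(tuple(a + frac * (b - a)
                            for a, b in zip(path[lo], path[hi])))
    return points


def path_distance(path_a, path_b, n=20):
    """Mean Euclidean distance between corresponding resampled poses;
    0.0 means the two paths coincide."""
    ra, rb = resample(path_a, n), resample(path_b, n)
    return sum(math.dist(a, b) for a, b in zip(ra, rb)) / n


# Two hypothetical 2-D "pose" paths, one unit apart everywhere.
p = [(0.0, 0.0), (1.0, 0.0)]
q = [(0.0, 1.0), (1.0, 1.0)]
print(path_distance(p, q))
```

A metric like this would let a lured trajectory be compared against known actions, to decide whether it counts as a new one.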
the
representation of a particular state-action pair is an action tuple:
what to do
when
to what
for how long
why
like an augmented state-action pair in which
the state information is provided by an associated percept (when), the action
(what) is the label for a given path through pose space
action tuples are organised into groups and compete probabilistically
for action based on value and applicability - eh???
use "action tuple" and "percept-action pair"
interchangeably
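My guess at what "compete probabilistically based on value and applicability" could mean, as a sketch: only tuples whose trigger percept is currently active are applicable, and among those one is drawn with probability proportional to its value. The field names and the weighting rule are assumptions of mine, not the paper's.

```python
import random
from dataclasses import dataclass


@dataclass
class ActionTuple:
    action: str      # what to do (label for a path through pose space)
    trigger: str     # when (name of the associated percept)
    target: str      # to what
    duration: float  # for how long, in seconds
    value: float     # why (learned estimate of future reward)


def choose(tuples, active_percepts, rng=random):
    """Pick one applicable action tuple, weighted by its value.
    Returns None if no tuple's trigger percept is active."""
    applicable = [t for t in tuples if t.trigger in active_percepts]
    if not applicable:
        return None
    # Small floor so zero-valued tuples still get an occasional try.
    weights = [max(t.value, 0.0) + 1e-6 for t in applicable]
    return rng.choices(applicable, weights=weights, k=1)[0]


# Hypothetical group of competing tuples.
group = [
    ActionTuple("sit", "sit_utterance", "self", 2.0, 0.9),
    ActionTuple("beg", "hand_out", "trainer", 3.0, 0.4),
]
print(choose(group, {"sit_utterance"}).action)
```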
the idea of
specificity of action being a bonus is interesting. Minsky considers it as a
criterion between Critics...
how are the
more specific children of states created???
using a classifier, that uses the reward context to help with
classification
how does it decide when to create a new state though???
cf parallel
terraced scan??? basically, a focused search, right???
well, yes, but it's clever the way they've found in advance the
potentially valuable areas
state-action
space discovery
the system is initially populated with only a few percept-action pairs
(i.e. action tuples) that represent general world states (i.e. reference
percepts at the top of the percept tree)
specialisation = over time, new percept-action pairs are added as the
system gathers evidence that a promising action associated with a given state
might be made even more reliable if associated with a specific child of the
state
in order for specialisation to occur (during the credit assignment
phase)
the value of the percept-action pair has to be above a threshold, i.e.
evidence that it would be valuable
it must have a child whose reliability/novelty is above a threshold
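The two conditions above can be written as a small rule. This is only a sketch: the threshold values and the dictionary of per-child reliability scores are placeholders I made up, since the notes give no numbers.

```python
def children_to_specialise(pair_value, child_reliability,
                           value_threshold=0.5, child_threshold=0.7):
    """Return the child percepts that deserve their own percept-action pair.

    pair_value: current value of the parent percept-action pair.
    child_reliability: maps child percept name -> reliability/novelty score.
    Both thresholds are hypothetical.
    """
    if pair_value <= value_threshold:
        return []  # not enough evidence that the action is worth refining
    return [child for child, score in child_reliability.items()
            if score > child_threshold]


# Hypothetical case: "sit" works well when any sound is heard, and better
# still when the sound is the "sit" utterance.
scores = {"sit_utterance": 0.9, "down_utterance": 0.2}
print(children_to_specialise(0.8, scores))
```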
an
unsupervised technique such as k-means clustering can be employed to partition
the observed patterns into distinct clusters or classes - each cluster/class
represents a region of the state space
instead, they treat all patterns that occur contemporaneously with an
action that directly leads to a reward as belonging to the same cluster
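To see the contrast, a sketch of both options. `kmeans` is a bare-bones version of the standard algorithm with a naive deterministic initialisation (first k points, my choice). `reward_context_cluster` is my reading of the alternative described in the notes: group patterns by the rewarded action they co-occurred with. The data layout is an assumption.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on a list of equal-length tuples.
    Naive initialisation: the first k points are the starting centroids."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster empties
                centroids[i] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i]))
              for p in points]
    return centroids, labels


def reward_context_cluster(observations):
    """observations: list of (pattern, rewarded_action_or_None).
    Patterns seen alongside the same rewarded action share a cluster;
    patterns with no rewarded action are ignored."""
    clusters = {}
    for pattern, action in observations:
        if action is not None:
            clusters.setdefault(action, []).append(pattern)
    return clusters


points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
print(kmeans(points, 2)[1])
print(reward_context_cluster([((1,), "sit"), ((2,), None), ((3,), "sit")]))
```

The second approach needs no distance metric or choice of k: the reward signal does the labelling.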
are they
able to build training scripts, so that new dog-brains can be brought up to
some base level of expertise and evaluated with some standard tests???
SVM???
support vector machines
how do they work???